Abstract. Modularity was introduced as a measure of goodness for the community structure induced by a partition of the set of vertices in a graph. It then also became an objective function used to find good partitions, with high success. Nevertheless, some works have shown a scaling limit and certain instabilities when finding communities with this criterion.

Modularity has been studied under several formalisms, such as Hamiltonians in a Potts model or Laplacians in spectral partitioning. In this paper we present a new probabilistic formalism to analyze modularity, and from it we derive an algorithm based on weakly optimal partitions. This algorithm obtains good quality partitions and also scales to large graphs.
1. Introduction

Finding communities is an important issue in complex systems: it is useful to classify, and even to predict, properties in biology or groups in sociology. A very successful method to find communities was based on betweenness [Freeman, 1977]. This divisive clustering method led to the problem of choosing a stopping criterion. So Newman introduced modularity in [Newman and Girvan, 2004] and [Newman, 2004] as a measure of goodness of such partitions. This notion has proved to be rich from the theoretical viewpoint, and in practice it provided a unifying tool to compare partitions obtained by a diversity of methods. On the other hand, several methods have been devised to obtain partitions directly by modularity optimization. This problem has been shown to be NP-hard, and many of the algorithms developed to approach the optimum are diverse adaptations of some known algorithms for these problems, with the notable exception of Blondel et al. [Blondel et al., 2008]. From a theoretical viewpoint, and apart from the complexity problem, modularity optimization has been shown to have some strong limitations, leading to partitions that do not conform to other intuitive or formal notions of community structure. These limitations are related to the scaling behavior of modularity, which causes long correlations in community structure and unnatural sizes of communities.



In this paper we introduce, as in [Reichardt and Bornholdt, 2006], a generalization of the modularity function for weighted graphs, with a resolution parameter t. We first give some properties of this generalization, analogous to known properties of the usual version. Then we introduce a notion of weak optimality of a partition and we study some properties of this notion, using our tools to shed new light on some of the general limitations of modularity. We address the scaling limit problem for weakly optimal partitions, and we show some of its effects for some examples on binary trees. Finally, we describe a fast algorithm that gives weakly optimal partitions, explore its similarities with [Blondel et al., 2008], and compare the results with those obtained by other means. The result of this comparison is rather surprising: the values of modularity that we obtained for standard graphs are comparable to, and in several cases better than, those obtained by other means. Of course this suggests that there is a stronger relation between weak optimality and optimality, explaining the performance of our algorithm and of [Blondel et al., 2008] (they also obtain weakly optimal partitions). This point deserves further investigation.

Jorge R. Busch, Mariano G. Beiro and J. Ignacio Alvarez-Hamelin are with Facultad de Ingenieria, Universidad de Buenos Aires, Paseo Colon 850, C1063ACV, Buenos Aires, Argentina. E-mail: {jbusch,mbeiro}@fi.uba.ar, ignacio.alvarez-hamelin@cnet.fi.uba.ar

J. I. Alvarez-Hamelin is also with INTECIN, U.B.A. and CONICET (Argentine Council of Scientific and Technological Research).

This paper is organized as follows. We introduce some probabilistic definitions in Section 2 and we analyze their consequences in Section 3. Section 4 presents our algorithm. We provide proofs for the lemmas in Section 5. Real complex networks are analyzed in Section 6, and we conclude our work in Section 7.



2. Definitions 

2.1. Some measures. Let $V$ be a finite set, and $m : V \times V \to \mathbb{Z}_+$ be a non-negative integer function such that $Z = \sum_{l,r} m(l,r) > 0$. We assume throughout this work that $m$ is, in addition, symmetric, that is $m(l,r) = m(r,l)$ for $(r,l) \in V \times V$, and that $\sum_r m(l,r) > 0$ for each $l \in V$. Then, we consider the oriented graph $G = G(V,E)$ whose vertices are the elements of $V$, and whose edges are the pairs $(l,r) \in V \times V$ such that $m(l,r) > 0$. That is, $G$ provided with $m$ is a weighted oriented graph, with the property that if $(l,r) \in E$ then $(r,l) \in E$. There can be isolated points in $G$, but if $v$ is isolated then $m(v,v) > 0$ and there is a loop in $v$. We define a probability measure $m_E$ in $V \times V$ by

$$m_E(l,r) = \frac{m(l,r)}{Z}$$

and additivity. We consider the marginal probabilities defined in $V$ by

$$m_L(l) = \sum_r m_E(l,r)$$

$$m_R(r) = \sum_l m_E(l,r)$$

and the product probability $m_{LR}$ defined in $V \times V$ by

$$m_{LR}(l,r) = m_L(l)\, m_R(r)$$

and additivity. Finally, for $t \ge 0$ we shall consider the signed measure $\mu_t$ in $V \times V$ given by

$$\mu_t(S) = m_E(S) - t\, m_{LR}(S)$$

for $S \subset V \times V$. By the assumed symmetry of $m$, we have that $m_L = m_R$, and we denote this marginal probability measure by $m_V$, and $m_{LR}$ by $m_{VV}$. Thus

$$\mu_t(S) = m_E(S) - t\, m_{VV}(S)$$
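As an illustration of these definitions, the following Python sketch builds $m_E$, $m_V$ and $\mu_t$ for a small example. The dictionary representation of $m$, the function names `measures` and `mu`, and the three-vertex path used as input are assumptions made for this example only; they are not part of the formalism.

```python
from fractions import Fraction


def measures(m):
    """m maps ordered pairs (l, r) to non-negative integer weights (symmetric)."""
    Z = sum(m.values())
    m_E = {pair: Fraction(w, Z) for pair, w in m.items()}
    m_V = {}
    for (l, _r), p in m_E.items():
        m_V[l] = m_V.get(l, 0) + p          # marginal of m_E on the left coordinate
    return Z, m_E, m_V


def mu(t, S, m_E, m_V):
    """Signed measure mu_t(S) = m_E(S) - t * m_VV(S) for a collection S of ordered pairs."""
    return sum(m_E.get((l, r), 0) - t * m_V[l] * m_V[r] for (l, r) in S)


# Path a - b - c, stored as a symmetric weight function.
m = {("a", "b"): 1, ("b", "a"): 1, ("b", "c"): 1, ("c", "b"): 1}
Z, m_E, m_V = measures(m)
whole = [(l, r) for l in m_V for r in m_V]   # the full square V x V
```

Since both $m_E$ and $m_{VV}$ are probabilities, `mu(t, whole, m_E, m_V)` equals $1 - t$ for any $t$; exact rational arithmetic makes this check free of rounding errors.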
2.2. Partitions. We shall consider partitions $\mathcal{C}$ of $V$, meaning a family of pairwise disjoint nonempty sets $C \subset V$ such that $\bigcup_{C \in \mathcal{C}} C = V$. We shall consider the usual (lattice) partial order between partitions of $V$: $\mathcal{C} \preceq \mathcal{C}'$ if $\mathcal{C}'$ is a refinement of $\mathcal{C}$, or, which is the same, for any $C \in \mathcal{C}$ it holds

$$C = \bigcup \mathcal{C}'_C$$

where $\mathcal{C}'_C = \{C' \in \mathcal{C}' : C' \subset C\}$. Notice that with this partial order, there is always a minimal partition $\mathcal{C}_0 = \{V\}$ and a maximal partition $\mathcal{C}_1 = \{\{v\} : v \in V\}$.

Given a partition $\mathcal{C}$ of $V$, we associate to it a set of diagonal pairs $(l,r) \in V \times V$, by

$$D(\mathcal{C}) = \bigcup_{C \in \mathcal{C}} C \times C$$

and the set of off diagonal pairs

$$\bar{D}(\mathcal{C}) = V \times V \setminus D(\mathcal{C}) = \bigcup_{C, C' \in \mathcal{C},\, C \ne C'} C \times C'$$

Consider a partition $\mathcal{C}$ of $V$, and define $c : V \to \mathcal{C}$ by $c(u) = C$ if $u \in C$. Consider then the quotient graph $G/\mathcal{C}$, whose vertices are the elements of the partition, with weights $m' = m/\mathcal{C} : \mathcal{C} \times \mathcal{C} \to \mathbb{Z}_+$ defined by

$$m'(C,C') = \sum_{v \in C,\, v' \in C'} m(v,v')$$

Then, we obtain a signed measure $\mu'_t$ in $\mathcal{C} \times \mathcal{C}$. Of course, if $S' \subset \mathcal{C} \times \mathcal{C}$ and $S = \{(v,v') \in V \times V : (c(v), c(v')) \in S'\}$, then

$$\mu'_t(S') = \mu_t(S)$$

Remarks 1. Typically $m$ will be the adjacency matrix of $G$. We admit more general weights in our description in order to include in our framework these quotient graphs and the corresponding measures. This will prove useful in the analysis of our algorithm, where we construct partitions starting from the maximal partition $\mathcal{C}_1$ and advancing through smaller and smaller partitions by iteratively joining two of their elements (see Remarks 3).

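2.3. Modularity. Now we define the modularity $Q_t(\mathcal{C})$ at resolution $t \ge 0$ of a partition $\mathcal{C}$ by

$$Q_t(\mathcal{C}) = \mu_t(D(\mathcal{C}))$$

and its complement

$$\bar{Q}_t(\mathcal{C}) = \mu_t(\bar{D}(\mathcal{C}))$$

(see Figure 1).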
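The definition can be evaluated directly on small graphs. The sketch below is only an illustration under stated assumptions: the graph is unweighted and loop-free, it is given as an edge list, and the function name `modularity` and the test graph (two triangles joined by one edge) are chosen for this example.

```python
from fractions import Fraction


def modularity(edges, partition, t=1):
    """Return (Q_t, complement of Q_t) for an unweighted, loop-free edge list."""
    Z = 2 * len(edges)                       # every edge is counted in both orders
    block = {v: i for i, C in enumerate(partition) for v in C}
    volume = [0] * len(partition)            # Z * m_V(C) for each community C
    inside = 0                               # Z * m_E(D(C))
    for u, v in edges:
        volume[block[u]] += 1
        volume[block[v]] += 1
        if block[u] == block[v]:
            inside += 2
    m_E_diag = Fraction(inside, Z)
    m_VV_diag = sum(Fraction(vol, Z) ** 2 for vol in volume)
    q = m_E_diag - t * m_VV_diag
    q_bar = (1 - m_E_diag) - t * (1 - m_VV_diag)
    return q, q_bar


# Two triangles joined by the edge (2, 3), split into the two triangles.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q, q_bar = modularity(edges, [{0, 1, 2}, {3, 4, 5}])
```

For this partition and $t = 1$ the two values are $5/14$ and $-5/14$; their sum is $1 - t$, in agreement with Lemma 1 (ii).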

References 1. If $m(v,w)$ is the adjacency matrix of $G$ and $t = 1$, then $Q_t(\mathcal{C})$ is the usual Newman-Girvan modularity (see for example [Newman, 2006]). For weighted graphs and $t = 1$, it was defined in [Newman, 2004] (in that paper it is assumed that $m(v,v) = 0$ for $v \in V$). If we put $\gamma = t$, we obtain the generalization of the modularity introduced in [Reichardt and Bornholdt, 2006] (where $m$ is the adjacency matrix). There is a subtle difference between our formalism and the one in this last paper: we represent the graph $G$ and the weights $m$ by the probability measure $m_E$ (in this general setting this idea is, of course, not new: it is at the very origin of random graph theory), obtain the difference with the null model probability $m_{VV}$ at the probability level, which gives $\mu_t$, and then apply it to $D(\mathcal{C})$ to obtain the modularity. Instead, in [Reichardt and Bornholdt, 2006] the authors take means in the null model to bring it to the graph level, and they make the differences at this level to obtain the Hamiltonian. We hope that our approach will help intuition and analysis, because it puts emphasis on the additive nature of $\mu_t$.

Figure 1. Here we illustrate the set $D(\mathcal{C})$ associated to a partition $\mathcal{C} = \{A, B, C, D\}$. $V$ is plotted in the interval $[0, 1]$, and we associate with each element $X$ in the partition an interval of length $m_V(X)$. With this setting, you may think of $m_{VV}$ as the area, and of $m_E$ as another symmetric probability measure in the same square.

Notice that the Newman-Girvan modularity is intimately related to Jacob Cohen's measure of agreement (1960) (see [Bishop et al., 2007], Chap. 11). The statistical usage of this measure justifies the widely used terminology "null model" for the measure $m_{VV}$.

2.4. Optimality. We call a partition $\mathcal{C}^*$ optimal for $Q_t$ when $Q_t(\mathcal{C}) \le Q_t(\mathcal{C}^*)$ for any other partition $\mathcal{C}$.

We call a partition $\mathcal{C}^*$ weakly optimal for $Q_t$ when $Q_t(\mathcal{C}) \le Q_t(\mathcal{C}^*)$ for any partition $\mathcal{C}$ such that $\mathcal{C} \preceq \mathcal{C}^*$.

We call a partition $\mathcal{C}^*$ positive for $\mu_t$ when $\mu_t(C \times C) \ge 0$ whenever $C$ is in $\mathcal{C}^*$.

We call a partition $\mathcal{C}^*$ submodular for $\mu_t$ when $\mu_t(C \times C') \le 0$ whenever $C$ and $C'$ are different sets in $\mathcal{C}^*$. When $\mathcal{C}^*$ is submodular, we shall call its elements communities.

We call a partition $\mathcal{C}$ internally connected when $G(C)$ (i.e. the subgraph of $G$ induced by $C$) is connected for all $C \in \mathcal{C}$.

References 2. The problem of $Q_1$ optimization has been shown to be NP-complete (see [Brandes et al., 2008]).

In [Reichardt and Bornholdt, 2006], the terms $Z\mu_t(C \times C)$ and $Z\mu_t(C \times C')$, $C \ne C'$, are called cohesion and adhesion respectively. We shall not make further use of this terminology.



3. Some consequences 
3.1. Some useful relations. 



Lemma 1. Let $\mathcal{C}$ be a partition of $V$. Then

(i) For any $C \in \mathcal{C}$,

$$\mu_t(C \times C) + \mu_t(C \times (V \setminus C)) = (1 - t)\, m_V(C)$$

(ii) $Q_t(\mathcal{C}) + \bar{Q}_t(\mathcal{C}) = 1 - t$

References 3. See Equation 14 and its context in [Reichardt and Bornholdt, 2006] for a discussion of these relations.

3.2. Relations between optimality notions. 

Lemma 2. Let $\mathcal{C}$ be a partition of $V$, and let $C, C' \in \mathcal{C}$ be different. Let $\mathcal{D}$ be the partition obtained from $\mathcal{C}$ by replacing $C$ and $C'$ by $C \cup C'$, that is

$$\mathcal{D} = (\mathcal{C} \setminus \{C, C'\}) \cup \{C \cup C'\}$$

Then

$$Q_t(\mathcal{D}) = Q_t(\mathcal{C}) + 2\mu_t(C \times C')$$

(see Figure 2).

Figure 2. Here we illustrate Lemma 2. The terms associated in $Q_t(\mathcal{C})$ to $C$ and $C'$ correspond to the black squares. When you join these sets to obtain $\mathcal{D}$, you replace these two terms by one, associated to the square formed by the black squares and the grey rectangles. The additivity and the symmetry of $\mu_t$ do the rest.
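The merge relation of Lemma 2 is easy to check numerically. The following sketch does so on a path with four vertices; the adjacency matrix `A`, the helper names `m_V`, `mu` and `Q`, and the chosen partition are assumptions of this example.

```python
from fractions import Fraction

# Adjacency matrix of the path 0 - 1 - 2 - 3 (an illustrative input).
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
Z = sum(map(sum, A))


def m_V(X):
    """Marginal probability of a vertex set X."""
    return Fraction(sum(sum(A[v]) for v in X), Z)


def mu(t, X, Y):
    """mu_t(X x Y) = m_E(X x Y) - t * m_V(X) * m_V(Y)."""
    m_E = Fraction(sum(A[l][r] for l in X for r in Y), Z)
    return m_E - t * m_V(X) * m_V(Y)


def Q(t, partition):
    """Q_t as the sum of mu_t over the diagonal blocks."""
    return sum(mu(t, C, C) for C in partition)


C, C_prime, rest = {0}, {1}, {2, 3}
before = [C, C_prime, rest]
after = [C | C_prime, rest]          # join C and C'
gain = Q(1, after) - Q(1, before)    # equals 2 * mu(1, C, C_prime)
```

Here $\mu_1(C \times C') = 1/9$, so joining the two singletons raises $Q_1$ by $2/9$.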



Lemma 3.

(i) If $\mathcal{C}^*$ is optimal, it is weakly optimal.

(ii) $\mathcal{C}^*$ is submodular for $\mu_t$ if and only if it is weakly optimal for $Q_t$.

(iii) If $\mathcal{C}^*$ is submodular for $\mu_t$ and $t \le 1$, then $\mathcal{C}^*$ is positive for $\mu_t$.

References 4. Lemma 2, which in our framework is an immediate consequence of the additivity of $\mu_t$, is a key tool in [Fortunato and Barthélemy, 2007] (see Equation 15 in that paper), in [Reichardt and Bornholdt, 2006] (see Equation 5 in that paper) and in [Kumpula et al., 2007] (see Equation 7 in that paper).

The relation between optimality and submodularity is addressed in [Reichardt and Bornholdt, 2006] (see Equation 19 and its context in that paper).
References 5. This is to justify the use of the term submodular. A real set function $\mu$ defined in a family $\mathcal{D}$ of sets, closed under unions and intersections, is called submodular when

$$\mu(X \cup Y) + \mu(X \cap Y) \le \mu(X) + \mu(Y)$$

for $X, Y \in \mathcal{D}$ (see [Fujishige, 2005]). If $\mathcal{C}$ is a partition of $V$, and $\mathcal{D}$ is the family formed by the unions of elements of $\mathcal{C}$, then the set function defined by

$$X \mapsto \mu_t(X \times X)$$

is submodular in $\mathcal{D}$ when $\mathcal{C}$ is submodular for $\mu_t$ according to our definition.

Lemma 4. Let $t \ge 0$ and let $\mathcal{C}$ be any partition of $V$. Let, for each $C \in \mathcal{C}$, $\mathcal{D}_C$ be the partition of $C$ associated to the connected components of $G(C)$. This defines a partition $\mathcal{D}$ of $V$. Then $\mathcal{D}$ is internally connected and $Q_t(\mathcal{D}) \ge Q_t(\mathcal{C})$.

References 6. This useful result means that when we look for optimal partitions, we can restrict our search to internally connected partitions. It generalizes Lemma 3.4 in [Brandes et al., 2008].

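3.3. Basic inequalities for $Q_t$. Denote

$$\rho(C) = m_E(C \times (V \setminus C))$$

Then we have

Lemma 5. If $0 \le t$ and $C \subset V$, then

(1) $m_V(C) = m_E(C \times C) + \rho(C) \le m_E(C \times C) + 2\rho(C) \le 1$

(2) $\mu_t(C \times C) = m_E(C \times C)\,(1 - t(m_E(C \times C) + 2\rho(C))) - t\rho^2(C)$

(3) $\mu_t(C \times C) \le m_E(C \times C)\,(1 - t\, m_V(C))$

(4) $\mu_t(C \times C) \le m_E(C \times C)\,(1 - 2t\rho(C))$

and, if in addition $t \le 1$, then

(5) $\mu_t(C \times C) \ge -t\rho^2(C)$

(see Figure 3).

Lemma 6. Let $\mathcal{C}$ be a partition of $V$, and $0 \le t$, then

(6) $Q_t(\mathcal{C}) \le 1 - t \sum_{C \in \mathcal{C}} m_V^2(C) \le 1 - t/|\mathcal{C}|$

(7) $Q_t(\mathcal{C}) \le m_E(D(\mathcal{C}))\,(1 - 2t \min_{C \in \mathcal{C}} \rho(C))$

and, if in addition $t \le 1$, then

(8) $Q_t(\mathcal{C}) \ge (-t/2)\,(1 - m_E(D(\mathcal{C})))$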

References 7. Suppose that $m(v,w)$ is the adjacency matrix of $G$ and $t = 1$. Then $Q_t(\mathcal{C})$ is the usual modularity of the partition $\mathcal{C}$. The inequality in (6) then gives $Q_t(\mathcal{C}) \le 1 - 1/|\mathcal{C}| < 1$ (see [Brandes et al., 2008], Lemma 3.1 and Corollary 6.4, and [Fortunato and Barthélemy, 2007], Formula 11). In this case the additional hypothesis for Equation (8) is true, and the inequality gives $Q_t(\mathcal{C}) \ge -(1 - m_E(D(\mathcal{C})))/2$ which (as $1 - m_E(D(\mathcal{C})) \le 1$) gives the lower bound in Lemma 3.1 of [Brandes et al., 2008].
Figure 3. Here we illustrate Equation (1). The dark gray region, when you apply $m_E$ to it, gives $\rho(C)$. If you add $m_E$ applied to the black region, you obtain $m_V(C)$ (recall that $m_V$ is the marginal probability of $m_E$). If you now add $m_E$ applied to the light gray region (which is also $\rho(C)$), of course this, being a probability, is at most 1.



Lemma 7. Let $\mathcal{C}$ be a partition of $V$, submodular for $\mu_t$. Then

$$Q_t(\mathcal{C}) \ge 1 - t$$

thus, if $t \le 1$, $Q_t(\mathcal{C}) \ge 0$.

3.4. Bounds for the size of the communities in submodular partitions: 
scaling limit. 

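Lemma 8. Let $\mathcal{C}$ be a partition of $V$, submodular for $\mu_t$, with $|\mathcal{C}| \ge 2$. Then

(i) If $C, C' \in \mathcal{C}$ are different, then

(9) $m_V(C \cup C') \ge 2\sqrt{\dfrac{m_E(C \times C')}{t}}$

(ii) Assume that $G$ is connected, and let $c^*$ denote the value of the minimum cut, with weights $m$, in $G$. Then, for all $C \in \mathcal{C}$ it holds

(10) $\left| m_V(C) - \dfrac{1}{2} \right| \le \sqrt{\dfrac{1}{4} - \dfrac{c^*}{tZ}}$

(11) $\left| \dfrac{1}{|\mathcal{C}|} - \dfrac{1}{2} \right| \le \sqrt{\dfrac{1}{4} - \dfrac{c^*}{tZ}}$

(12) $\dfrac{c^*}{tZ} \le m_V(C) \le 1 - \dfrac{c^*}{tZ}$

(13) $|\mathcal{C}| \le \dfrac{tZ}{c^*}$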

References 8. In Eq. (9) we showed that if two communities $C$, $C'$ are connected (i.e. if $m_E(C \times C') > 0$) then

$$m_V(C \cup C') \ge 2\sqrt{\frac{m_E(C \times C')}{t}}$$

This is our version of the fundamental scaling limit found, for $t = 1$, in [Fortunato and Barthélemy, 2007] (see the discussion in pp. 38-39). For a general $t$ (called $\gamma$ in that paper) this scaling limit was considered in [Kumpula et al., 2007]. Notice that this bound is for the union of two connected communities. Later on we show by a toy example that a similar bound for one community does not hold. This example also shows that it is not easy to obtain, from this scaling limit, bounds on the number of communities.

Remarks 2. For one community, the best bounds that we could obtain are in Eq. (12). Given these lower bounds, of course we obtain also an upper bound for $|\mathcal{C}|$ in Eq. (13). These bounds are not tight, but they are suggestive of a qualitative behavior:

• As we shall show in our Daisy example below, there may exist very big and very small communities. The scaling limit shows that small communities will be joined to the big ones, and not to each other.

• When $m$ is the adjacency matrix of $G$, $c^*$ is the connectivity, and our bounds suggest that for higher connectivities the sizes of the communities are less dispersed.

• The behavior of the bounds with respect to $t$ is also suggestive: for big $t$, we find more communities, smaller, and with more dispersed sizes, as we shall later see in the examples.

3.4.1. Daisy example (see Figure 4). Consider a star with a center $c$ of degree $m = 25r$, and $m$ homologous $T_j$ formed by one vertex joined to the center and two leaves. Let $\mathcal{C}$ be an internally connected partition of $V$, and assume that no element of $\mathcal{C}$ reduces to a leaf (see [Brandes et al., 2008], Lemma 3.3: notice that this lemma does not generalize to arbitrary $t > 1$). Call $C_0$ the community where $c$ lies. Then $C_0$ is formed by the center and $n \le m$ of the $T_j$, and the remaining elements of the partition are the remaining $C_j = V(T_j)$. Thus,

$$\mu_t(C_0 \times T_j) = \frac{1}{6m}\left(1 - t\,\frac{5(m + 5n)}{6m}\right)$$

Then the pair $C_0, T_j$ is submodular when $n \ge r(6/t - 5)$. Let us first consider the case $t = 1$. It is easy to show that you obtain a $Q_1 = \frac{4}{25}\left(4 - \frac{1}{6r}\right)$ optimal partition taking $n = r$ and the remaining $24r$ $T_j$ as components. If you increase $r$, you obtain as many modules $T_j$ with total degree 5 as you wish. Of course, the number of communities in this example, $24r + 1$, is of the same order as $Z = 150r$.

Figure 4. Daisy example with $r = 1$. Here black lines represent edges internal to a community, and gray lines represent edges between communities. In this case there is only one big community, formed by the central vertex and one petal, and 24 small communities associated to the remaining petals.

On the other hand, we would like to add this example to the section on counterintuitive behavior of modularity optimization in [Brandes et al., 2008]. The strong asymmetry in the community structure, despite the strong symmetry in the graph, and the arbitrary selection of $r$ homologous $T_j$ for the central community, are technical artifacts. This is essentially due to the presence of a center joined to a myriad of small isolated communities, conditions that we can not rule out from the real world.

Let $t_n = \frac{6}{5 + n/r}$, $0 \le n \le r$ (notice that $1 = t_r < \ldots < t_0 = 6/5$). Then the partition $\mathcal{C}^*$ optimal for $Q_{t_n}$ has $n$ $T_j$'s in the central community, and $m - n$ small communities $T_j$. This shows the influence of $t$ in the scaling limit.
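The Daisy computation can be reproduced from the definitions alone. The sketch below builds the graph explicitly and evaluates $Q_t$ for the partition with $n$ petals in the central community; the vertex labels, the function names `daisy`, `q` and `daisy_partition`, and the edge-list representation are assumptions of this example. With $n = r$ and $t = 1$ it gives $16/25 - 2/(75r) = \frac{4}{25}(4 - \frac{1}{6r})$, that is, about 0.613 for $r = 1$.

```python
from fractions import Fraction


def daisy(r):
    """Edge list of the Daisy graph: a centre 'c' joined to m = 25 r petals."""
    m = 25 * r
    edges = []
    for j in range(m):
        hub, leaf1, leaf2 = ("p", j), ("a", j), ("b", j)
        edges += [("c", hub), (hub, leaf1), (hub, leaf2)]
    return m, edges


def q(edges, partition, t=1):
    """Q_t of a partition, computed from an unweighted, loop-free edge list."""
    Z = 2 * len(edges)
    block = {v: i for i, C in enumerate(partition) for v in C}
    volume = [0] * len(partition)
    inside = 0
    for u, v in edges:
        volume[block[u]] += 1
        volume[block[v]] += 1
        if block[u] == block[v]:
            inside += 2
    return Fraction(inside, Z) - t * sum(Fraction(d, Z) ** 2 for d in volume)


def daisy_partition(r, n):
    """Centre plus the first n petals in one community, every other petal alone."""
    m, edges = daisy(r)
    petals = [{("p", j), ("a", j), ("b", j)} for j in range(m)]
    centre = {"c"}.union(*petals[:n])
    return edges, [centre] + petals[n:]


edges, partition = daisy_partition(1, 1)
q_star = q(edges, partition)         # Fraction(46, 75), about 0.613
```

Which petals go into the central community is irrelevant for the value, in line with the remark above that this selection is arbitrary.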

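3.4.2. On complete binary trees. Let $G$ be a tree and let $m$ be its adjacency matrix. Then for any internally connected partition $\mathcal{C}$ of $V$, $G/\mathcal{C}$ is also a tree, and we have

$$Q_1(\mathcal{C}) = 1 - \frac{2(|\mathcal{C}| - 1)}{Z} - \frac{1}{|\mathcal{C}|} - \sum_{C \in \mathcal{C}} \left(m_V(C) - \frac{1}{|\mathcal{C}|}\right)^2$$

This follows from our definition of $Q_1$, noticing that

$$m_E(D(\mathcal{C})) = 1 - m_E(\bar{D}(\mathcal{C})) = 1 - \frac{2(|\mathcal{C}| - 1)}{Z}$$

because the number of edges between communities is, in this case, $|\mathcal{C}| - 1$, and that

$$m_{VV}(D(\mathcal{C})) = \sum_{C \in \mathcal{C}} m_V^2(C) = \frac{1}{|\mathcal{C}|} + \sum_{C \in \mathcal{C}} \left(m_V(C) - \frac{1}{|\mathcal{C}|}\right)^2$$

by the well known relation between central and noncentral second order moments. Let $s = |\mathcal{C}|$, and consider the function

$$\varphi(s) = \frac{2(s - 1)}{Z} + \frac{1}{s}$$

This function has its minimum at

$$s^* = \left\lfloor \frac{1 + \sqrt{1 + 2Z}}{2} \right\rfloor$$

(here $\lfloor \cdot \rfloor$ denotes the floor function). Of course from this we obtain the general bound for the optimal $Q_1$ of a tree

$$Q_1 \le 1 - \varphi(s^*)$$

References 9. This estimate is similar to the results obtained in [Fortunato and Barthélemy, 2007] in a very special case (see Equation 9 and its context in that paper).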
This bound is tight for complete binary trees, because these particular graphs are almost regular, and then the second order moment

$$\sum_{C \in \mathcal{C}} \left(m_V(C) - \frac{1}{|\mathcal{C}|}\right)^2$$

may be considered negligible. This is not the case for our Daisy example where we find, for $r = 1$, $Q^* = 0.613$ and $1 - \varphi(s^*) = 0.782$.
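The bound $1 - \varphi(s^*)$ depends only on $Z$, so it can be evaluated in a few lines. The following sketch uses hypothetical helper names (`phi`, `s_star`, `tree_bound`); the integer square root gives the same floor as the formula for $s^*$ without floating point error.

```python
from math import isqrt


def phi(s, Z):
    """phi(s) = 2 (s - 1) / Z + 1 / s."""
    return 2 * (s - 1) / Z + 1 / s


def s_star(Z):
    """floor((1 + sqrt(1 + 2 Z)) / 2), computed with integer arithmetic."""
    return (1 + isqrt(1 + 2 * Z)) // 2


def tree_bound(Z):
    """Upper bound 1 - phi(s*) for Q_1."""
    return 1 - phi(s_star(Z), Z)


# Z = 28 and Z = 124 are the complete binary trees of height 3 and 5,
# Z = 150 is the Daisy example with r = 1.
bounds = {Z: tree_bound(Z) for Z in (28, 124, 150)}
```

For $Z = 150$ this gives $s^* = 9$ and a bound of about 0.782, against the value 0.613 actually reached in the Daisy example.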



Figure 5. Here we show a complete binary tree of height 5 and its corresponding partition $\mathcal{C}_h$ (in this case $h = 2$). The black edges are internal to a community, the gray ones are between communities.

To show this, let $G$ be a complete binary tree of height $n$, for which $Z = 2^{n+2} - 4$, let $h = \lceil (n - 2)/2 \rceil$ (here $\lceil \cdot \rceil$ stands for the ceiling function) and let us consider the partition $\mathcal{C}_h$ of $V$ formed by $R_h$, the vertex set of the complete binary subtree of height $h$, and the connected components that remain when you remove $R_h$ from $G$ (see Figure 5). Then $|\mathcal{C}_h| = 1 + 2^{h+1}$. ($\mathcal{C}_h$ is a weakly optimal partition, and very nearly optimal. We shall later show some cases for which it is not optimal, see Section 6.1.) Rather than a detailed and cumbersome proof of this fact, we show in the following table that $Q_1(\mathcal{C}_h) \approx 1 - \varphi(s^*)$, and that this approximation is better when $n$ increases.





n     $1 - \varphi(s^*)$    $Q_1(\mathcal{C}_h)$
3     0.5357143             0.505102
5     0.7620968             0.757024
6     0.8297258             0.824263
10    0.9562724             0.9539936
20    0.9986194             0.998536

Table 1. Upper bounds and results for partitions $\mathcal{C}_h$
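The second column of Table 1 can be recomputed from the definition of $\mathcal{C}_h$. The sketch below assumes a heap-style numbering of the tree (vertex $v \ge 2$ has parent $\lfloor v/2 \rfloor$, the root is 1), which is a convenience of this example and not something the construction depends on; the function name `q1_Ch` is likewise hypothetical.

```python
from fractions import Fraction
from math import ceil


def q1_Ch(n):
    """Q_1 of the partition C_h for the complete binary tree of height n."""
    h = ceil((n - 2) / 2)

    def block(v):
        depth = v.bit_length() - 1
        if depth <= h:
            return 0                          # v lies in R_h
        return v >> (depth - (h + 1))         # ancestor of v at depth h + 1

    Z = 2 ** (n + 2) - 4
    volume, inside = {}, 0
    for v in range(2, 2 ** (n + 1)):          # one edge (v, parent of v) per vertex
        bu, bv = block(v // 2), block(v)
        volume[bu] = volume.get(bu, 0) + 1
        volume[bv] = volume.get(bv, 0) + 1
        if bu == bv:
            inside += 2
    return Fraction(inside, Z) - sum(Fraction(d, Z) ** 2 for d in volume.values())


values = {n: float(q1_Ch(n)) for n in (3, 5, 6)}
```

For $n = 3$ the exact value is $99/196$, which rounds to the 0.505102 shown in the table.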
4. Building up submodular partitions

4.1. Basis for an algorithm. Let $\mathcal{C}$ be a partition of $V$. Let

$$t(\mathcal{C}) = \max \frac{m_E(C \times C')}{m_{VV}(C \times C')}$$

where the max is extended to all pairs $(C, C') \in \mathcal{C} \times \mathcal{C}$ such that $C \ne C'$ (if $|\mathcal{C}| = 1$, we set $t(\mathcal{C}) = 0$). We call $t(\mathcal{C})$ the resolution of $\mathcal{C}$.

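Lemma 9. Let $\mathcal{C}, \mathcal{D}$ be partitions of $V$ and $t \ge 0$. Then

(i) $\mathcal{C}$ is submodular for $\mu_t$ if and only if $t \ge t(\mathcal{C})$.

(ii) If $\mathcal{C} \preceq \mathcal{D}$, then $t(\mathcal{C}) \le t(\mathcal{D})$.

(iii) $t(\mathcal{C}) \le t(\mathcal{C}_1)$

(iv) $t(\mathcal{C}) = 0$ if and only if $\mathcal{C} \preceq \mathcal{B}$, where $\mathcal{B}$ is the partition of $V$ associated to the connected components of $G$.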

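Let $\mathcal{C}$ be a partition of $V$ and $t \ge t(\mathcal{C})$. We shall use

$$\alpha(\mathcal{C}) = m_{VV}(D(\mathcal{C})) = \sum_{C \in \mathcal{C}} m_V^2(C)$$

$$Z_0(\mathcal{C}) = \{(C, C') \in \mathcal{C} \times \mathcal{C},\ C \ne C' : \mu_t(C \times C') = 0\}$$

Then we have

Lemma 10. If $t \ge t(\mathcal{C}) > 0$, then $t = t(\mathcal{C})$ if and only if $Z_0(\mathcal{C}) \ne \emptyset$.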

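Lemma 11. Let $\mathcal{C}$ be a partition of $V$ with $t = t(\mathcal{C}) > 0$ and let $(C, C') \in Z_0(\mathcal{C})$. Define a new partition $\mathcal{D}$ of $V$ by

$$\mathcal{D} = (\mathcal{C} \setminus \{C, C'\}) \cup \{C \cup C'\}$$

Then $\mathcal{D} \preceq \mathcal{C}$ is submodular for $\mu_t$ and

$$|\mathcal{D}| = |\mathcal{C}| - 1$$

$$|Z_0(\mathcal{D})| < |Z_0(\mathcal{C})|$$

$$Q_t(\mathcal{D}) = Q_t(\mathcal{C})$$

$$\alpha(\mathcal{D}) = \alpha(\mathcal{C}) + 2\, m_{VV}(C \times C')$$

For $s \le t$, we obtain

$$Q_s(\mathcal{D}) = Q_t(\mathcal{D}) + (t - s)\,\alpha(\mathcal{D}) \ge Q_s(\mathcal{C})$$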

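Lemma 12. Let $\mathcal{C}$ be a partition of $V$, and let $t = t(\mathcal{C}) > 0$. Apply iteratively the scheme described in the previous lemma, until you obtain a new partition $\mathcal{D} \prec \mathcal{C}$ of $V$ such that $Z_0(\mathcal{D}) = \emptyset$. Then,

$$\alpha(\mathcal{D}) > \alpha(\mathcal{C})$$

$$t(\mathcal{D}) < t$$

$$Q_t(\mathcal{D}) = Q_t(\mathcal{C})$$

$$Q_{t(\mathcal{D})}(\mathcal{D}) = Q_t(\mathcal{C}) + \alpha(\mathcal{D})\,(t - t(\mathcal{D}))$$

For $s \le t$, we obtain once more

$$Q_s(\mathcal{D}) \ge Q_s(\mathcal{C})$$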
Our algorithm is based on the last two lemmas. Starting at $\mathcal{C} = \mathcal{C}_1$, and $t = t(\mathcal{C}_1)$, we apply iteratively the scheme described in Lemma 11 until we obtain a partition $\mathcal{D}$ such that $t(\mathcal{D}) < t(\mathcal{C})$. Now, we update $t$ to $t(\mathcal{D})$, $\mathcal{C}$ to $\mathcal{D}$, and iterate. The algorithm goes on while $t(\mathcal{D}) > 1$ and the final result is the last $\mathcal{D}$, a submodular partition for $\mu_1$.
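The scheme above can be sketched in a few lines of Python. This is a naive illustration and not the C++ implementation mentioned in Section 6: it assumes an unweighted, loop-free graph given as an edge list with comparable vertex labels, it rescans all pairs at every step, and the names `ratio`, `resolution`, `join` and `submodular_partition` are hypothetical. Exact fractions are used so that the test $\mu_t(C \times C') = 0$ is an exact equality.

```python
from fractions import Fraction


def ratio(e, a, b, vol, Z):
    """m_E(C x C') / m_VV(C x C') for two blocks joined by e edges."""
    return Fraction(e * Z, vol[a] * vol[b])


def resolution(w, vol, Z):
    """t(C): the largest ratio over pairs of different blocks (0 if no edges remain)."""
    return max((ratio(e, a, b, vol, Z) for (a, b), e in w.items()),
               default=Fraction(0))


def join(w, vol, members, a, b):
    """Replace blocks a and b by their union (kept under the label a)."""
    members[a] |= members.pop(b)
    vol[a] += vol.pop(b)
    merged = {}
    for (x, y), e in w.items():
        x, y = (a if x == b else x), (a if y == b else y)
        if x != y:                       # edges inside a block no longer count
            key = (min(x, y), max(x, y))
            merged[key] = merged.get(key, 0) + e
    return merged


def submodular_partition(edges, t_stop=1):
    """Join pairs with mu_t = 0 at decreasing resolutions t until t <= t_stop."""
    labels = {v: i for i, v in enumerate(sorted({v for e in edges for v in e}))}
    members = {i: {v} for v, i in labels.items()}
    vol = dict.fromkeys(members, 0)      # Z * m_V(block)
    w = {}                               # number of edges between different blocks
    for u, v in edges:
        a, b = sorted((labels[u], labels[v]))
        vol[a] += 1
        vol[b] += 1
        w[(a, b)] = w.get((a, b), 0) + 1
    Z = 2 * len(edges)
    t = resolution(w, vol, Z)
    while t > t_stop:
        while True:                      # the scheme of Lemma 11, repeated as in Lemma 12
            zero = [k for k, e in w.items() if ratio(e, *k, vol, Z) == t]
            if not zero:
                break
            w = join(w, vol, members, *zero[0])
        t = resolution(w, vol, Z)        # strictly smaller than the previous t
    return list(members.values()), t


# Two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
communities, t_final = submodular_partition(edges)
```

On this small graph the resolutions visited are $7/2$, $7/3$ and finally $2/7$, and the result is the partition into the two triangles. Keeping only block volumes and inter-block edge counts amounts to working on the quotient graph, which is the update described in Remarks 3.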

Remarks 3. After the first steps of the algorithm, we usually obtain only one partition for each resolution. Let us denote by $\mathcal{C}_t$ the first partition with resolution $t$. Then, $Q_t = Q_t(\mathcal{C}_t)$ and $Q_{1,t} = Q_1(\mathcal{C}_t)$. The function $t \to Q_t$ is strictly decreasing and convex, hence $1/t \to Q_t$ is increasing and concave. The function $1/t \to Q_{1,t}$ is strictly increasing (see Lemma 11 and Lemma 12). Both behaviors can be seen in the figures at the end of Section 6.

At the end of each step, giving a partition $\mathcal{D}$, all the partitions $\mathcal{D}'$ considered afterwards satisfy $\mathcal{D}' \preceq \mathcal{D}$. This means that you can update the graph to be $G/\mathcal{D}$ (doing the corresponding update in the weights), with a relevant gain in speed and memory.

References 10. Later on we shall compare the performance and results of our 
algorithm with others. 

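Here we want to describe briefly its relation with the algorithm described in [Blondel et al., 2008], which is similar in various aspects.

Let $\mathcal{C}, \mathcal{D}$ be two partitions of $V$, with $\mathcal{C} \preceq \mathcal{D}$. Call $\mathcal{C}$ submodular for $\mu_t$ with respect to $\mathcal{D}$ when, if $C, C' \in \mathcal{C}$ are different, $D \in \mathcal{D}$ and $D \subset C$, then:

$$\mu_t(D \times (C \setminus D)) \ge \mu_t(D \times C')$$

Notice that when $\mathcal{D} = \mathcal{C}$ this is the ordinary submodularity for $\mu_t$ (see Figure 6).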




Figure 6. Here we illustrate the effect in $Q_t$ of the replacement in $\mathcal{C}$ of $C$ by $C \setminus D$ and $C'$ by $C' \cup D$: we lose $2\mu_t(D \times (C \setminus D))$, we gain $2\mu_t(D \times C')$.

Fixed $D \subset C$, select $C'$ so as to maximize

$$\mu_t(D \times C') - \mu_t(D \times (C \setminus D))$$

and define

$$M_D(\mathcal{C})$$

as the partition obtained from $\mathcal{C}$ by replacing, when the maximum is strictly positive ($M_D(\mathcal{C}) = \mathcal{C}$ in the other case), $C$ by $C \setminus D$ (eliminating it if it happens to be empty) and $C'$ by $C' \cup D$. Of course,

$$Q_t(M_D(\mathcal{C})) \ge Q_t(\mathcal{C})$$
(with equality holding only if $M_D(\mathcal{C}) = \mathcal{C}$) and $M_D(\mathcal{C}) \preceq \mathcal{D}$. Notice that there is no reason for $M_D(\mathcal{C})$ to be internally connected, even if $\mathcal{C}$ is. Consider the elements of $\mathcal{D}$ enumerated, $D_1, \ldots, D_n$, and let

$$M_{\mathcal{D}} = M_{D_n} \cdots M_{D_1}$$

If we start with $\mathcal{C} = \mathcal{D}$ and define

$$\mathcal{C}^{(k)} = M_{\mathcal{D}}^k(\mathcal{C})$$

from some $k_0$ on all the $\mathcal{C}^{(k)}$ are the same (because we are in a finite setting and $Q_t$ increases in each iteration), and $M_{\mathcal{D}}^{k_0}(\mathcal{D}) = \mathcal{C}^{(k_0)}$ is submodular for $\mu_t$ with respect to $\mathcal{D}$. This is what the authors of [Blondel et al., 2008] called one "pass". They start (as we do) from $\mathcal{D}^{(0)} = \mathcal{C}_1$, the maximal partition, and by one pass they obtain $\mathcal{D}^{(1)} = M_{\mathcal{D}}^{k_0}(\mathcal{D})$. Then, they update $\mathcal{D}$ to $\mathcal{D}^{(1)}$ and proceed again in the same way, getting $\mathcal{D}^{(2)} = M_{\mathcal{D}^{(1)}}^{k_1}(\mathcal{D}^{(1)})$. Proceeding recursively in this way, at some time they obtain $\mathcal{D}$ such that $\mathcal{D} = M_{\mathcal{D}}(\mathcal{D})$, which means that $\mathcal{D}$ is submodular for $\mu_t$. Notice, as the authors of [Blondel et al., 2008] did, that after a $\mathcal{D}$ update, all the partitions $\mathcal{D}'$ considered satisfy $\mathcal{D}' \preceq \mathcal{D}$. This means that you can update the graph to be $G/\mathcal{D}$ (doing the corresponding update in the weights), with a relevant gain in speed and memory. This happens also in our algorithm (see Remarks 3).

In [Blondel et al., 2008] the authors only consider the case $t = 1$, thus they obtain a partition that is submodular for $\mu_1$. Their algorithm is very fast, and able to deal with huge networks. They assert, and show by some example, that the intermediate community structures given by the algorithm overcome the scaling-limit problem. Perhaps the advantage, if any, of our algorithm, lies in that we obtain our intermediate community structures with strict control on the resolution. Notice that all our intermediate partitions are, by construction, internally connected. This is not necessarily the case for those obtained in [Blondel et al., 2008].

5. On the proofs

5.1. Section 3.

5.1.1. Lemma 1. For the first statement, notice that both $m_E$ and $m_{VV}$ have marginal probability $m_V$. The second statement follows from the first, adding for all $C \in \mathcal{C}$.

5.1.2. Lemma 2. We have already shown in Figure 2 how this Lemma follows, by graphical evidence, from the additivity of $\mu_t$.

5.1.3. Lemma 3.

(i) This follows immediately from the definitions.

(ii) If $\mathcal{C}^*$ is weakly optimal, then from Lemma 2 it follows that $\mu_t(C \times C') \le 0$ for $C, C' \in \mathcal{C}^*$, $C \ne C'$, whence $\mathcal{C}^*$ is submodular.

Conversely, if $\mathcal{C}^*$ is submodular and $\mathcal{D} \preceq \mathcal{C}^*$, for any $D \in \mathcal{D}$

$$\mu_t(D \times D) \le \sum_{C \in \mathcal{C}^*_D} \mu_t(C \times C)$$

and it follows that

$$Q_t(\mathcal{D}) \le \sum_{D \in \mathcal{D}} \sum_{C \in \mathcal{C}^*_D} \mu_t(C \times C) = Q_t(\mathcal{C}^*)$$
Hence, $\mathcal{C}^*$ is weakly optimal.

(iii) This is immediate from the first statement in our Lemma 1.

5.1.4. Lemma 4. Let $C \in \mathcal{C}$ and let $D, D' \in \mathcal{D}_C$ be different. By our definition of $\mathcal{D}_C$, there are no edges in $D \times D'$, whence $m_E(D \times D') = 0$ and it follows that $\mu_t(D \times D') \le 0$.

Then

$$\mu_t(C \times C) \le \sum_{D \in \mathcal{D}_C} \mu_t(D \times D)$$

for all $C \in \mathcal{C}$, whence $Q_t(\mathcal{C}) \le Q_t(\mathcal{D})$.

5.1.5. Lemma 5. In Figure 3 we have already shown by graphical evidence that the first statement holds. The second statement follows by replacing $m_V(C)$ in $\mu_t(C \times C) = m_E(C \times C) - t\, m_V^2(C)$ by $m_E(C \times C) + \rho(C)$. The remaining statements in the Lemma are easy consequences of these two.

5.1.6. Lemma 6. Here all is a consequence of Lemma 5. In addition we used some general well known inequalities, that we state here once and for all:

Let $x_i, y_i$ be positive real numbers, $i = 1, \ldots, n$. Then

(i) E^f 

5.1.7. Lemma 7. This follows from Lemma 1 if you notice that, when $\mathcal{C}$ is submodular for $\mu_t$, $\bar{Q}_t(\mathcal{C}) \le 0$.

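5.1.8. Lemma 8.

(i) By the submodularity, we have

$$m_E(C \times C') - t\, m_V(C)\, m_V(C') \le 0$$

whence $m_V(C)\, m_V(C') \ge \frac{m_E(C \times C')}{t}$. Now

$$m_V^2(C \cup C') = (m_V(C) + m_V(C'))^2 \ge 4\, m_V(C)\, m_V(C')$$

and the result follows.

(ii) By the submodularity, for each $C' \in \mathcal{C}$, $C' \ne C$, we have

$$m_E(C \times C') \le t\, m_V(C)\, m_V(C')$$

Sum for all $C' \ne C$, to obtain

$$m_E(C \times (V \setminus C)) \le t\, m_V(C)\,(1 - m_V(C))$$

Now $Z m_E(C \times (V \setminus C))$ is the weight of a cut in $G$, whence

$$\frac{c^*}{tZ} \le m_V(C)\,(1 - m_V(C))$$

Complete squares in the right, and the first inequality follows.
As $\sum_{C \in \mathcal{C}} m_V(C) = 1$, we have

$$\min_{C \in \mathcal{C}} m_V(C) \le \frac{1}{|\mathcal{C}|} \le \max_{C \in \mathcal{C}} m_V(C)$$

so that the second inequality follows from the first.
From the first inequality, we obtain

$$\frac{1}{2} - \sqrt{\frac{1}{4} - \frac{c^*}{tZ}} \le m_V(C) \le \frac{1}{2} + \sqrt{\frac{1}{4} - \frac{c^*}{tZ}}$$

whence, using the well known $\sqrt{1 - x} \le 1 - x/2$ for $0 \le x \le 1$, we obtain

$$\sqrt{\frac{1}{4} - \frac{c^*}{tZ}} \le \frac{1}{2} - \frac{c^*}{tZ}$$

and the third inequality follows. The last inequality follows from this one immediately.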

5.2. Section 4.

5.2.1. Lemma 9.

(i) This is obvious from the definitions.

(ii) If $\mathcal{C} \preceq \mathcal{D}$ and $\mathcal{D}$ is submodular for $\mu_t$, it follows immediately from the additivity of $\mu_t$ that $\mathcal{C}$ is also submodular for $\mu_t$.

(iii) This follows from the previous point.

(iv) $t(\mathcal{C}) = 0$ means that $m_E(C \times C') = 0$ when $C, C' \in \mathcal{C}$, $C \ne C'$. But then the connected components of $G(C)$ are, for any $C \in \mathcal{C}$, connected components of $G$, whence the statement.

5.2.2. Lemma 10. This follows at once from the definitions.

5.2.3. Lemma 11. All our statements follow easily from the construction of $\mathcal{D}$, perhaps with the exception of $|Z_0(\mathcal{D})| < |Z_0(\mathcal{C})|$. For this, notice that $Z_0(\mathcal{D})$ is obtained from $Z_0(\mathcal{C})$ by deleting all pairs where some coordinate is $C$ or $C'$, and adding the pairs of the form $(C \cup C', D')$ for which $\mu_t((C \cup C') \times D') = 0$, and $D'$ is neither $C$ nor $C'$. But $\mu_t((C \cup C') \times D') = \mu_t(C \times D') + \mu_t(C' \times D') = 0$ implies that $\mu_t(C \times D') = \mu_t(C' \times D') = 0$, so that for each pair that we eventually add, we have deleted two (the same argument applies, of course, reversing the order in the coordinates). As we have deleted from $Z_0(\mathcal{C})$ at least $(C, C')$, the statement follows.

5.2.4. Lemma 12. All the statements are easy consequences of the previous lemmas.



6. Application to networks 

We implemented our algorithm for building submodular partitions in C++; the source code is available on SourceForge [DeltaCom, 2010].

Here, we compare our results with those obtained by other methods: the algorithm of Newman based on the spectrum of the modularity matrix [Newman, 2006]; the algorithm by Duch and Arenas using extremal optimization [Duch and Arenas, 2005]; the fast, greedy algorithm of [Clauset et al., 2004]; and the hierarchical fast-unfolding method of [Blondel et al., 2008]. We omit previous algorithms, like the betweenness-based Girvan-Newman method, the spectrum-based ones and the simulated annealing method of Guimerà et al. [Guimerà and Nunes Amaral, 2005], which are rather slow and size limited. We analyze binary trees and some real networks.

6.1. Binary trees revisited. In Section 3.4.2 we found an upper bound for modularity on complete binary trees. Here, we apply our algorithm to them, and show the results in Table 2, with a comparison to Blondel's algorithm [Blondel et al., 2008]. Both provide similar results, quite close to the bounds in Table 1.

We also provide a visualization of a submodular community partition for trees of height 5, in Figure 7. Notice the subtle differences with the $\mathcal{C}_h$ partition in Figure 5.
n     Blondel     this paper
5     0.758195    0.758195
6     0.821712    0.821051
7     0.876850    0.877364
10    0.953032    0.953219

Table 2. Newman's modularity for binary trees of height n.



Figure 7. Communities obtained by Blondel's algorithm and by 
this paper. We remark that for a tree of height 5, both achieve the 
same result. 



6.2. Real networks. Table 3 displays the results of $Q_1$ for different common
test networks: a karate club network studied by Zachary [Zachary, 1977], a
network of email interchanges at a university compiled by Guimerà et al.
[Guimerà et al., 2003], a metabolic network from the C. elegans
[Duch and Arenas, 2005], a set of scientific co-citations in arXiv
[CUP, 2003], a trust network of users of the PGP algorithm
[Boguñá et al., 2004], a coauthorship network on condensed matter physics
[Newman, 2001], the nd.edu domain of the www [Albert et al., 1999], a web
graph from Google [Leskovec et al., 2008], and an Internet map at the
inter-router level obtained with DIMES [DIMES, 2005].

This comparison table is similar to those found in [Newman, 2006] and
[Duch and Arenas, 2005].

We observe that our algorithm gives better results in terms of modularity than
the method of Clauset et al. For big networks, we also improve on the results
of Newman and Duch-Arenas (not so for the smallest networks).

For larger real networks many of these methods fail, as their algorithmic
complexity is too high. In those cases, we provide a comparison with the
Blondel fast algorithm [Blondel et al., 2008], which is also scalable and
publicly available. It gives the best results for very large networks, as far
as we know.

To end this section, Figures 8 to 11 display how the resolution t and the
modularity Q evolve for different networks.
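
To make the role of the resolution parameter concrete, here is a small sketch under a loudly stated assumption: it takes the modularity with resolution t to have the form Q_t = sum over communities C of [e_C/m - t (d_C/(2m))^2], which may differ in detail from the definition used earlier in this paper. With that form, Q_1 - Q_t = (t - 1) times the second-order moment of the community sizes, the quantity mentioned in the caption of Figure 10. The toy graph and all names are hypothetical.

```python
from collections import defaultdict


def community_stats(edges, community):
    """Return (m, e_C, d_C): edge count, internal edges and degree sum per community."""
    internal = defaultdict(int)
    degree = defaultdict(int)
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return len(edges), internal, degree


def q_t(edges, community, t=1.0):
    """Assumed form of the modularity with resolution parameter t."""
    m, internal, degree = community_stats(edges, community)
    return sum(internal[c] / m - t * (degree[c] / (2 * m)) ** 2 for c in degree)


def second_moment(edges, community):
    """Sum over communities of (d_C / (2m))^2."""
    m, _, degree = community_stats(edges, community)
    return sum((d / (2 * m)) ** 2 for d in degree.values())


# Toy graph (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

for t in (0.5, 1.0, 2.0):
    gap = q_t(edges, community, 1.0) - q_t(edges, community, t)
    print(f"t={t}  Q_t={q_t(edges, community, t):.4f}  Q_1-Q_t={gap:.4f}")
```

Under this assumption, a partition into many communities of small total degree has a small second-order moment, so its Q_t and Q_1 values stay close for any fixed t.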

Network        Size   Newman   Duch-Arenas   Clauset et al.   Blondel   this paper
karate           34    0.419         0.419            0.381     0.419        0.405
dolphins         62        -             -                -     0.519        0.50?
email             ?    0.572         0.574            0.494         ?        0.524
metabolic       453    0.435         0.434            0.402     0.438        0.419
arxiv          9377        -         0.770            0.772     0.813        0.797
key signing   10680    0.855         0.846            0.733     0.884        0.864
condmat       27519    0.723         0.679            0.668     0.750        0.723
web-nd       325729        -             -                -     0.935        0.935
web-google   875712        -             -                -     0.978        0.968
ir_dimes     976025        -             -                -     0.845        0.839

Table 3. Comparison of Newman's modularity for some real networks using
different algorithms.




Figure 8. Evolution of $Q_1(C_t)$ for some networks. We see that
the last resolution strongly depends on the size of the network.
Even for big networks, the optimal values are reached near $t = 1$.
Notice that the first few values of $Q_1$ are negative; we do not plot
them.



Figure 9. Evolution of $Q_t(C_t)$ for some networks. Notice that
the first values of $Q_t$ are negative. We do not plot them.



Figure 10. Comparison of $Q_t(C_t)$ and $Q_1(C_t)$ for the ir_dimes
network. The closeness of both curves for t = 1 is attributable to
a small second order moment of the sizes $m_V(C)$, and to the great
number of communities.



Figure 11. For karate, a small network, we see a greater contrast
between $Q_1(C_t)$ and $Q_t(C_t)$.



7. Conclusions

In this paper we have shown several properties of the modularity with resolution
parameter for weighted graphs, systematically using our version of the definition.
Several of these properties were known in special cases, as we have mentioned in
detail in our reference sections; some of them are new. We introduced a notion
of weak optimality of a partition, and we described an algorithm to obtain weakly
optimal partitions. We have shown that this algorithm is able to deal with huge
networks, and that the resulting values of modularity are comparable to those
obtained by some of the known optimization algorithms.

We showed that the known limitation of modularity optimization, its scaling
limit, is also a limitation for weak optimality. The introduction of the resolution
parameter t partially solves this limitation: for t > 1 there are weaker restrictions,
but we feel that it is necessary to make a deeper modification in the modularity
to obtain, through its optimization, community structures that satisfy natural
specifications.